Predicting Molecule Toxicity via Descriptor-based Graph Self-supervised Learning

Abstract

Predicting molecular properties with Graph Neural Networks (GNNs) has recently drawn a lot of attention, with compound toxicity prediction being one of the biggest challenges. In cases where there is insufficient labeled molecule data, an effective approach is to pre-train GNNs on large-scale unlabeled data and then fine-tune them for downstream tasks. Among pre-training strategies, node-level pre-training involves masking and predicting atom properties, while motif-based methods capture rich information in subgraphs. These approaches have shown effectiveness across various downstream tasks. However, current frameworks face two main challenges: (1) auxiliary tasks do not preserve useful domain knowledge, and (2) fusion is computationally expensive. To address these challenges, we propose Descriptor-based Graph Self-supervised Learning (DGSSL), a method that utilizes domain knowledge to enhance graph representation learning. Specifically, it identifies descriptor centers in molecules and encodes motif-like information as special atomic numbers. This enables the self-supervised method to also capture local information. Experimental results demonstrate that our method achieves state-of-the-art performance on three toxicity-related benchmarks.
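
The two ideas named in the abstract, relabeling descriptor centers with special atomic numbers and masking atoms for prediction, can be illustrated with a toy sketch. This is not the paper's implementation: the offset of 119, the mask token 0, the descriptor id, the mask rate, and both function names are assumptions made only for illustration.

```python
import random

# Hypothetical offset for "special" atomic numbers; real elements stop at 118.
SPECIAL_OFFSET = 119
# Hypothetical id that stands in for a masked atom.
MASK_TOKEN = 0


def relabel_descriptor_centers(atomic_numbers, centers):
    """Return a copy where each descriptor-center atom gets a special id.

    `centers` maps atom index -> descriptor type id (both hypothetical).
    """
    relabeled = list(atomic_numbers)
    for atom_idx, descriptor_id in centers.items():
        relabeled[atom_idx] = SPECIAL_OFFSET + descriptor_id
    return relabeled


def mask_atoms(labels, mask_rate, rng):
    """Mask a random subset of atoms; return masked inputs and targets."""
    n_mask = max(1, round(mask_rate * len(labels)))
    masked_idx = sorted(rng.sample(range(len(labels)), n_mask))
    inputs = list(labels)
    for i in masked_idx:
        inputs[i] = MASK_TOKEN
    # The pre-training task would be to predict these original labels.
    targets = {i: labels[i] for i in masked_idx}
    return inputs, targets


# Nitromethane heavy atoms: C, N, O, O. Atom 1 (N) is treated as the
# center of a hypothetical "nitro" descriptor with id 0.
atoms = [6, 7, 8, 8]
labels = relabel_descriptor_centers(atoms, {1: 0})
inputs, targets = mask_atoms(labels, mask_rate=0.25, rng=random.Random(0))
```

In this sketch, a masked descriptor center has a special id as its target, so recovering it requires information about the surrounding substructure rather than the element alone. A real pipeline would derive the centers from a cheminformatics toolkit and feed the labels to a GNN.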

Similar resources

Interpretable Graph-Based Semi-Supervised Learning via Flows

In this paper, we consider the interpretability of the foundational Laplacian-based semi-supervised learning approaches on graphs. We introduce a novel flow-based learning framework that subsumes the foundational approaches and additionally provides a detailed, transparent, and easily understood expression of the learning process in terms of graph flows. As a result, one can visualize and inter...

Graph-Based Semi-Supervised Learning

While labeled data is expensive to prepare, ever-increasing amounts of unlabeled data are becoming widely available. In order to adapt to this phenomenon, several semi-supervised learning (SSL) algorithms, which learn from labeled as well as unlabeled data, have been developed. In a separate line of work, researchers have started to realize that graphs provide a natural way to represent data in ...

Reblur2Deblur: Deblurring Videos via Self-Supervised Learning

Motion blur is a fundamental problem in computer vision as it impacts image quality and hinders inference. Traditional deblurring algorithms leverage the physics of the image formation model and use hand-crafted priors: they usually produce results that better reflect the underlying scene, but present artifacts. Recent learning-based methods implicitly extract the distribution of natural images...

Parallel Graph-Based Semi-Supervised Learning

Semi-supervised learning (SSL) is the process of training decision functions using small amounts of labeled and relatively large amounts of unlabeled data. In many applications, annotating training data is time-consuming and error prone. Speech recognition is the typical example, which requires large amounts of meticulously annotated speech data (Evermann et al., 2005) to produce an accurate sy...

pkCSM: Predicting Small-Molecule Pharmacokinetic and Toxicity Properties Using Graph-Based Signatures

Drug development has a high attrition rate, with poor pharmacokinetic and safety properties a significant hurdle. Computational approaches may help minimize these risks. We have developed a novel approach (pkCSM) which uses graph-based signatures to develop predictive models of central ADMET properties for drug development. pkCSM performs as well as or better than current methods. A freely accessi...

Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2023.3308203